Project 1¶
Author: Maja Noack
Date: 2025-09-29
part_1¶
Part 1: Perspective and orthographic projections¶
This question aims to help you understand how projection affects images. Using a camera (a phone camera is fine), take two photos of the same scene:
- Perspective projection: Take a normal photo, where lines appear to converge due to perspective (e.g., a desk or the edges of a building).
- Orthographic projection (approximate): Either zoom in from farther away or crop the central part of the image so that lines appear more parallel and less converging.
Choose any scene you like (indoor or outdoor), but make sure there are some straight edges in the images.
Question 1 (and submission requirements):
- Submit both images (perspective and approximate orthographic).
- On each image, draw or overlay lines along a few straight edges.
- Write a brief note (a few sentences) describing the difference you see between the two projections.
The parallel edges of the microwave appear to converge toward a point in the perspective projection (left), as shown by the elongated yellow overlay lines. This convergence creates a sense of depth, but it distorts the geometry: lines that are parallel in reality, such as the sides of the microwave, do not stay parallel in the image. In the approximate orthographic projection, photographed from farther away (right), the edges stay almost parallel. The orthographic projection gives objects a flat appearance, removing depth cues while preserving shape, proportions, and the parallelism of lines.
part_2¶
Part 2: Histogram Manipulation & Linear Filtering¶
(a) Histogram Equalization on Provided Images (required)
Question 1: You are given three test images (grayscale test image 1, 2, 3) with low or uneven contrast. The task is to:
- Compute the histogram of each image.
- Compute the cumulative distribution function (CDF) from the histogram.
- Use the CDF to perform histogram equalization.
- Display, for each test image:
- The original image and its histogram
- The equalized image and its histogram
Compute the histogram of each image.
A histogram shows how values are distributed over an image or data set (how many times each value appears). GOAL: create a histogram from the given grayscale images.
Grayscale has 256 shades of gray -> 256 bins in the histogram.
The bin width is defined as w = (b − a)/k, with [a, b] being the interval (here a = 0 and b = 255) and k being the number of bins.
Bin edges: [0, 1, 2, ..., 255, 256]
- turn the grayscale image into a 1D array
- count the occurrences per bin
- plot the histogram
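A minimal numpy sketch of these steps (the function name is my own; the explicit counting loop is for clarity, `np.bincount` would do the same thing faster):

```python
import numpy as np

def compute_histogram(gray, bins=256):
    """Count how often each intensity (0..255) occurs in a grayscale image."""
    flat = np.asarray(gray).ravel()       # turn the image into a 1D array
    hist = np.zeros(bins, dtype=np.int64)
    for v in flat:                        # count occurrences per bin
        hist[v] += 1
    return hist

# tiny example: a 2x2 image with three distinct gray values
img = np.array([[0, 0], [255, 128]], dtype=np.uint8)
h = compute_histogram(img)
print(h[0], h[128], h[255])  # 2 1 1
```

Plotting is then a single call, e.g. `plt.bar(range(256), h)` with matplotlib.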
Compute the cumulative distribution function (CDF) from the histogram.
CDF is defined as: $$ \text{CDF}[k] = \sum_{i=0}^{k} h[i] $$
Use the CDF to perform histogram equalization.
- normalize the CDF so it maps to the full intensity range
- replace each original pixel value by its corresponding value from the normalized CDF.
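Both steps (CDF and remapping) can be sketched as follows; this assumes uint8 images and a non-constant input, and the min-shifted normalization is one common convention:

```python
import numpy as np

def equalize(gray):
    """Histogram equalization of a uint8 grayscale image via the normalized CDF."""
    hist = np.bincount(gray.ravel(), minlength=256)   # histogram h[i]
    cdf = hist.cumsum()                               # CDF[k] = sum_{i<=k} h[i]
    cdf_min = cdf[cdf > 0][0]                         # first nonzero CDF value
    # normalize so the mapping spans the full 0..255 intensity range
    lut = np.clip(np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255), 0, 255)
    return lut.astype(np.uint8)[gray]                 # replace each pixel via the lookup table

img = np.array([[50, 50], [100, 150]], dtype=np.uint8)
out = equalize(img)
print(out)  # maps 50 -> 0, 100 -> 128, 150 -> 255
```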
(b) Creative Task: Apply to Your Own Images (required)
Question 2: Find at least one of your own images with poor contrast (too dark, too bright, or low contrast). Apply your histogram equalization implementation to this image and show the result.
- Submit the original and equalized versions side by side (image and its histogram).
- Briefly describe how equalization changed the image and whether you think it improved the quality.
Observation:
Histogram equalization spread the pixel intensities across the full 0–255 range, which improved contrast and revealed previously hidden details. Applying equalization separately to each R, G, and B channel, however, introduced color distortions throughout the image.
In the first image, the blinds separate more clearly from the interior shadows, and the dark wall shows additional detail. The equalized version has better contrast, but it loses some of the gradual tones that made the original look natural. In the second image, equalization produced an intensely bright sky and sunlight while increasing contrast and detail in the field and trees; however, the grass took on an unnatural green-blue hue and the color balance shifted toward purple, which reduced realism. For this image I prefer the original. In the third image, equalization enhanced the contrast between the person, the shoreline, and the sky, revealing additional detail in the person's face and the surrounding vegetation. The colors also deviate from the original, although not as drastically as in image 2. For images 1 and 3 I prefer the equalized versions.
Overall, equalization improved contrast and visibility in all three images, but it also introduced color distortions. The technique works well for enhancing detail, but it does not preserve natural color accuracy.
(d) Derivative of Gaussian (Required)
In class, we saw that applying a Gaussian filter followed by a derivative filter is mathematically equivalent to directly applying a derivative of Gaussian (DoG) filter. Here, you will implement both approaches and compare the results.
Question 4: You are given three test images (grayscale test image 4, 5, 6), the task is to:
- Naive Approach (Two Steps)
- Smooth the image using a Gaussian filter.
- Apply a simple derivative filter (e.g., [-1, 0, 1]) in the x-direction and y-direction separately.
- Compute and visualize the results for both directions.
- Direct Approach (One Step) - 🆕 see the notes (PDF link) for how to derive and discretize the DoG filters.
- Construct a derivative of Gaussian filter in the x-direction and y-direction.
- Convolve the image directly with these filters.
- Visualize the results.
- Comparison
- Display the outputs from both approaches side by side.
- Briefly describe your observations: are the results nearly the same? Why might small differences appear?
Naive Approach
2D Gaussian function is:
$$ G(x, y) = \frac{1}{2 \pi \sigma^2} \; \exp\!\left(-\frac{x^2 + y^2}{2\sigma^2}\right) $$
(x,y) = distance from the kernel center.
σ = standard deviation (how much blur)
For Gaussian smoothing, convolve the image with the Gaussian kernel, i.e., apply the filter at every pixel.
Apply a simple derivative filter (e.g., [-1, 0, 1]) in the x-direction and y-direction separately.
Derivative filter: [-1, 0, 1] (a central-difference approximation).
Compute and visualize the results for both directions.
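The two-step pipeline above can be sketched in plain numpy (function names are my own; zero padding, a 3σ kernel radius, and applying the filters as cross-correlation are assumptions of this sketch):

```python
import numpy as np

def gaussian_kernel1d(sigma, radius=None):
    """Discrete 1D Gaussian, normalized to sum to 1."""
    if radius is None:
        radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    g = np.exp(-x**2 / (2 * sigma**2))
    return g / g.sum()

def filter1d(img, k, axis):
    """Apply kernel k along rows (axis=1) or columns (axis=0), zero-padded.
    Cross-correlation (no kernel flip): fine for the symmetric Gaussian, and
    for [-1, 0, 1] it yields the central difference I[x+1] - I[x-1]."""
    img = np.asarray(img, dtype=float)
    pad = len(k) // 2
    out = np.zeros_like(img)
    if axis == 1:
        padded = np.pad(img, ((0, 0), (pad, pad)))
        for j in range(img.shape[1]):
            out[:, j] = padded[:, j:j + len(k)] @ k
    else:
        padded = np.pad(img, ((pad, pad), (0, 0)))
        for i in range(img.shape[0]):
            out[i, :] = k @ padded[i:i + len(k), :]
    return out

def naive_gradients(img, sigma=1.0):
    """Step 1: Gaussian smoothing (separable); step 2: simple derivative filter."""
    g = gaussian_kernel1d(sigma)
    smoothed = filter1d(filter1d(img, g, axis=1), g, axis=0)
    d = np.array([-1.0, 0.0, 1.0])
    dx = filter1d(smoothed, d, axis=1)   # x-direction
    dy = filter1d(smoothed, d, axis=0)   # y-direction
    return dx, dy

# vertical step edge: strong x-gradient, no y-gradient in the interior
img = np.zeros((11, 11)); img[:, 6:] = 1.0
dx, dy = naive_gradients(img, sigma=1.0)
```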
Direct Approach (One Step)
2D Gaussian function:
$$ G_\sigma(x, y) \;=\; \frac{1}{2 \pi \sigma^2} \exp\!\left(-\frac{x^2 + y^2}{2\sigma^2}\right). $$
Using derivative theorem of convolution:
$$ (I * G_\sigma)' \;\equiv\; I * (G'_\sigma), $$
Derivative can be moved inside the kernel.
Partial derivatives of the Gaussian¶
derivative with respect to (x):
$$ \frac{\partial G_\sigma}{\partial x}(x,y) = -\frac{x}{\sigma^2} \, G_\sigma(x,y) $$
derivative with respect to (y):
$$ \frac{\partial G_\sigma}{\partial y}(x,y) = -\frac{y}{\sigma^2} \, G_\sigma(x,y) $$
The Gaussian is separable, so we can define a 1D Gaussian
$$ g(x) = \frac{1}{\sqrt{2 \pi}\sigma} \; \exp\!\left(-\frac{x^2}{2\sigma^2}\right), $$
and its derivative
$$ g'(x) = -\frac{x}{\sigma^2} \, g(x). $$
Then the 2D derivative-of-Gaussian kernels can be written as outer products:
$$ K_x = g'(x) \, g(y)^\top, \qquad K_y = g(x) \, g'(y)^\top. $$
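These outer products can be sketched in numpy; the 3σ grid radius and the row = y / column = x convention are assumptions of this sketch:

```python
import numpy as np

def dog_kernels(sigma, radius=None):
    """Derivative-of-Gaussian kernels K_x, K_y as outer products of 1D factors."""
    if radius is None:
        radius = int(3 * sigma)
    t = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-t**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)  # g(x)
    gp = -t / sigma**2 * g                                             # g'(x) = -x/sigma^2 * g(x)
    Kx = np.outer(g, gp)   # rows: y (smoothing), columns: x (derivative)
    Ky = np.outer(gp, g)   # rows: y (derivative), columns: x (smoothing)
    return Kx, Ky

Kx, Ky = dog_kernels(sigma=1.0)
```

Convolving the image once with `Kx` (resp. `Ky`) then replaces the two-step smooth-plus-derivative pipeline in the x (resp. y) direction.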
Observation: The two methods produce similar edge-detection results. The same kernel size and sigma were used so the two approaches could be compared fairly. The two-step method produces more sharply defined results but does not suppress noise as well, whereas the DoG method's edges are less sharp and less contrasted, but all three images show much less noise. The difference is most visible for image 3, which has the strongest noise: the DoG results reduce it drastically, whereas the two-step approach retains most of it. This is likely because both approaches use discrete filters that only approximate the true continuous derivative of Gaussian, so slight differences are to be expected.
(e) Creative Task: Image Sharpening (Required)
In class, we discussed that by subtracting a smoothed (low-pass) version of an image from the original, we can extract the edges and details (high-frequency details). Adding these details back into the original image enhances sharpness.
Question 5: You are given two test images (grayscale test image CT and moon), the task is to:
- Extract High-Frequency Details
- Smooth the image using a Gaussian filter.
- Subtract the smoothed version from the original image to obtain the high-frequency component.
- Sharpen the Image
- Add the high-frequency component back to the original image.
- Experiment with different blending weights (e.g., original + α × high-frequency, with α ranging from 1 to 10).
- Observe how the sharpness changes as you vary α.
- Creative Exploration
- Apply your sharpening method to your own images.
- Pick at least one example where sharpening makes the image look noticeably better, and one example where too much sharpening creates artifacts (e.g., amplified noise).
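The sharpening steps can be sketched as unsharp masking in numpy (edge padding, the 3σ radius, and the clipping to 0..255 are my assumptions):

```python
import numpy as np

def sharpen(img, alpha=1.0, sigma=1.0):
    """Unsharp masking: img + alpha * (img - blurred(img))."""
    radius = int(3 * sigma)
    t = np.arange(-radius, radius + 1)
    g = np.exp(-t**2 / (2 * sigma**2))
    g /= g.sum()                              # 1D Gaussian, sums to 1
    img = np.asarray(img, dtype=float)
    # separable Gaussian blur with edge padding (np.convolve flips the kernel,
    # which is harmless here since g is symmetric)
    padded = np.pad(img, radius, mode='edge')
    blur = np.apply_along_axis(lambda r: np.convolve(r, g, mode='valid'), 1, padded)
    blur = np.apply_along_axis(lambda c: np.convolve(c, g, mode='valid'), 0, blur)
    detail = img - blur                       # high-frequency component
    return np.clip(img + alpha * detail, 0, 255)

flat = np.full((8, 8), 100.0)                 # no detail -> unchanged
step = np.zeros((10, 10)); step[:, 5:] = 255.0
out = sharpen(step, alpha=10.0)               # overshoot at the edge gets clipped
```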
Observation: The images show minimal changes for α = 1. In the moon image, edges become slightly more defined and textures more visible without any noticeable artifacts; the CT image is slightly sharper but still very blurry. For higher α values in the range 5 to 10, the images become very sharp, exposing the fine craters of the moon and removing most of the blur from the CT image. The images become over-sharpened when α exceeds 25: edges start to produce halos and micro-texture becomes grainy and noisy, distorting the original images drastically and losing detail.
Observation: Similar to the example images, for very large alpha (α = 100) artifacts appear in all three images, for example around the legs of the crawfish in image 3 or on the tree trunk in image 2. If α becomes too large, the image also loses contrast and detail, which is most apparent in image 1, where the cat almost blurs into the background. Good results without artifacts can be seen at α = 15 for image 1 (the cat is more contrasted and the image less noisy), and similarly at α = 25 for images 2 and 3.
part_3¶
Part 3: Anisotropic Diffusion¶
Question 1: Implementation of Anisotropic Diffusion
$$ \frac{\partial I}{\partial t} = \nabla \cdot (c(x,y,t) \nabla I) $$
where $c(x,y,t)$ is the diffusion coefficient that controls the amount of smoothing depending on the local image gradient. Use the following two diffusion functions:
$$ c(|\nabla I|) = e^{-(|\nabla I|/K)^2} $$
$$ c(|\nabla I|) = \frac{1}{1+(|\nabla I|/K)^2} $$
Run your implementation for different values of the parameter $K$ and different numbers of iterations. How does the choice of $K$ affect the result? What happens as you increase the number of iterations?
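A minimal Perona-Malik sketch of this update, assuming the usual 4-neighbor discretization with step size λ (function name and default values are my own; λ ≤ 0.25 keeps the scheme stable):

```python
import numpy as np

def anisotropic_diffusion(img, K=10.0, iters=20, lam=0.2, option='exp'):
    """Perona-Malik anisotropic diffusion with a 4-neighbor discretization."""
    if option == 'exp':
        c = lambda d: np.exp(-(d / K) ** 2)        # c(|grad I|) = e^{-(|grad I|/K)^2}
    else:
        c = lambda d: 1.0 / (1.0 + (d / K) ** 2)   # c(|grad I|) = 1/(1+(|grad I|/K)^2)
    I = np.asarray(img, dtype=float).copy()
    for _ in range(iters):
        # differences toward the four neighbors; zero flux across the border
        dN = np.roll(I, 1, 0) - I;  dN[0, :] = 0
        dS = np.roll(I, -1, 0) - I; dS[-1, :] = 0
        dW = np.roll(I, 1, 1) - I;  dW[:, 0] = 0
        dE = np.roll(I, -1, 1) - I; dE[:, -1] = 0
        I = I + lam * (c(dN) * dN + c(dS) * dS + c(dW) * dW + c(dE) * dE)
    return I

rng = np.random.default_rng(0)
noisy = rng.normal(128, 10, (16, 16))
out = anisotropic_diffusion(noisy, K=10.0, iters=5)  # smoother than the input
```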
Observation 1 (size of K):
For both diffusion functions and both images, the larger K is, the more blurred the image becomes. K sets the gradient magnitude that separates noise and texture from actual edges. When K is small, the conductance decreases rapidly, so diffusion is strongly restricted: for K < 10, both images and both diffusion functions retain clear edges with only slight cross-edge blurring. However, especially for the weaker reciprocal diffusion, this produces only mild denoising within image regions. When K is large, the conductance c stays close to 1 because most gradients are small compared to K, which makes diffusion stronger overall and smooths edges more vigorously. This is especially apparent for K = 30 with reciprocal diffusion, where the edges in both images are strongly blurred. At the same K, the exponential conductance function preserves edges better than the reciprocal one because it decreases more rapidly with increasing gradient magnitude.
Observation 2 (number of iterations T):
Increasing T adds more diffusion steps, giving the process more time to smooth the image. For small T, the result is light denoising: fine noise is removed while most details and edges are preserved. For moderate T, flat areas become noticeably cleaner, textures are simplified, and strong edges remain mostly intact, especially with the exponential conductance. For large T, the image degrades into flat regions: low-contrast features fade away, edges widen slightly, and fine detail disappears entirely. The exponential conductance preserves edges longer than the reciprocal form, while interior areas keep getting smoother as T grows. Note that the effective diffusion time depends on both the number of iterations and the step size λ, since T × λ represents the total diffusion duration.
Question 2: Comparison with Gaussian Smoothing
- Apply Gaussian smoothing to the same images.
- Compare results side-by-side: how does anisotropic diffusion preserve edges differently? Discuss the results and the differences you observed.
Observation:
Anisotropic diffusion smooths the regions between high-contrast edges while protecting the strong boundaries between them. Plain Gaussian blur treats all locations identically, averaging pixels across edges as well, which blurs them. In anisotropic diffusion, the local amount of smoothing decreases near edges, where intensity changes rapidly, so boundaries stay sharp while flat or noisy areas become smoother.
In the image of the woman, Gaussian filtering reduces the noise but also softens the hat brim, the eye contours, and the feather details. Anisotropic diffusion, on the other hand, largely maintains edge definition while eliminating background noise. The exponential conductance preserves the most edge definition, giving tight boundaries with minimal loss of fine detail, whereas the reciprocal conductance produces a more uniform appearance with slightly less structure definition and more blur.
The CT image shows that Gaussian smoothing blurs the tissue boundaries and decreases the contrast of vessel and skull edges. Anisotropic diffusion removes the noise, but also some of the fine detail inside the brain. The exponential variant keeps edges the most defined but slightly softens thin structures, whereas the reciprocal variant produces the most homogeneous areas at the cost of slightly more edge blurring, though it preserves weaker edges better.